# Low Latency Processing
Ten Vad
Apache-2.0
TEN VAD is a low-latency, lightweight, and high-performance streaming voice activity detection system, suitable for real-time voice processing scenarios.
Speech Recognition, Other
TEN-framework
16
29
Omniparser V2.0
MIT
OmniParser is a general-purpose screen parsing tool that converts UI screenshots into structured formats to improve the performance of LLM-based UI agents.
Image-to-Text
Transformers

microsoft
6,729
1,185
Llava Mini Llama 3.1 8b
GPL-3.0
LLaVA-Mini is an efficient large multimodal model that represents each image with only 1 visual token, significantly reducing the cost of image and video understanding.
Image-to-Text
ICTNLP
12.45k
51
Pikachu
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, capable of transforming input audio into Pikachu-style speech.
Speech Synthesis
Transformers

sail-rvc
2,216
0
Todoroki2333333
This is an RVC (Retrieval-based Voice Conversion) model designed for audio-to-audio conversion tasks.
Speech Synthesis
Transformers

sail-rvc
376
0
Spongebob
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, which can convert input audio into SpongeBob's voice.
Speech Synthesis
Transformers

sail-rvc
15
1
Shrek
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, capable of converting source speech into a target voice style.
Speech Synthesis
Transformers

sail-rvc
5,919
2
Rubberchicken
This is an RVC (Retrieval-based Voice Conversion) model designed for audio-to-audio conversion tasks.
Speech Synthesis
Transformers

sail-rvc
383
0
Kanyewest
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, capable of transforming input audio into Kanye West's vocal style.
Speech Synthesis
Transformers

sail-rvc
3,523
0
Justinbiebermw
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, designed to transform input audio into Justin Bieber's vocal style.
Speech Synthesis
Transformers

sail-rvc
4,656
0
Erenyeager
This is a voice conversion model based on RVC (Retrieval-based Voice Conversion) technology, capable of transforming input audio into a specific character's voice.
Speech Synthesis
Transformers

sail-rvc
693
0
Butters
This is an RVC (Retrieval-based Voice Conversion) model designed for audio-to-audio conversion tasks.
Speech Synthesis
Transformers

sail-rvc
20
0
Bakugo2333333
This is an RVC (Retrieval-based Voice Conversion) model designed for audio-to-audio conversion tasks.
Speech Synthesis
Transformers

sail-rvc
687
0